Abstract. Multivariate machine learning techniques provide an alternative to the rapidity gap
method for event-by-event identification and classification of diffraction in hadron-hadron
collisions. Traditionally, such methods assign each event exclusively to a single class, producing
classification errors in overlap regions of data space. As an alternative to this so-called hard
classification approach, we propose estimating posterior probabilities of each diffractive class and
using these estimates to weight event contributions to physical observables. A Monte Carlo
study shows that such a soft classification scheme reproduces observables such as multiplicity
distributions and relative event rates with much higher accuracy than hard classification.
INTRODUCTION

Diffraction is usually identified based on large rapidity gaps (LRG), although it is
widely acknowledged that this requirement alone leads to insufficient separation
between diffractive and non-diffractive events. This is due to long-range correlations that
may destroy the LRG. In fact, the gap survival probability of single diffractive events
at LHC energies is only of the order of 10% [1]. Additionally, because of fluctuations
in hadronization, the non-diffractive background also contains a non-negligible number
of LRG events [2]. Moreover, a rapidity gap may just be an experimental artifact due
to high detection thresholds. Hence, in order to achieve more efficient identification of
diffraction, alternatives to a simple cut on Δη should be investigated.

In this paper, we study the use of multivariate classification algorithms for identifying
diffraction and for discriminating between single diffractive and double diffractive
events. Such an approach does not explicitly look for rapidity gaps, but instead
considers the full event topology. That is, instead of heuristically
determining the type of events to look for, such algorithms are able to learn the event
characteristics providing the best discriminative power based on a suitably selected
training set of labeled events.

Most classification algorithms map each observation into a single
class. We call these hard classification algorithms, examples of which are neural
networks and support vector machines. In our case, hard classification corresponds to
classifying each event as either single diffractive, with the diffractive system on the left
(SDL) or the right side (SDR), double diffractive (DD) or non-diffractive (ND).^1 As
there is inherent mixing between these classes, such an approach is bound to produce
classification errors in the overlap regions of the data space. This is especially the case
with DD events, which often exhibit characteristics similar to SD and ND events. For
this reason, instead of considering a single class only, we propose estimating the
probabilities for each event to belong to each of the classes. We then use these probabilities
to weight the contribution of an event to physical observables. In the spirit of [4], we call
such an approach soft classification.

SOFT CLASSIFICATION METHODOLOGY 

In this work, we estimate the posterior probability of an event x to belong to class C_i
using the k nearest neighbors (kNN) algorithm,^2 for which p(C_i|x) = k_i/k, where k_i is the
number of observations from class C_i among the k nearest neighbors of x in the training
set [5]. The nearest neighbors are found using the Euclidean distance, although other
distance metrics can be used as well. In addition to soft classification, kNN can also
be used for hard classification, in which case the class is selected based on the highest
posterior probability.
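
The estimate p(C_i|x) = k_i/k can be illustrated with a minimal sketch in plain Python. The function names, the toy two-dimensional training points and the choice k = 5 are assumptions made for illustration only; they are not taken from the analysis described here.

```python
import math
from collections import Counter


def knn_posteriors(x, train_points, train_labels, k, classes):
    """Estimate p(C_i | x) = k_i / k from the k nearest training points."""
    # Rank training points by Euclidean distance to x and keep the k closest.
    order = sorted(range(len(train_points)),
                   key=lambda j: math.dist(x, train_points[j]))
    counts = Counter(train_labels[j] for j in order[:k])
    return {c: counts.get(c, 0) / k for c in classes}


def hard_label(posteriors):
    """Hard classification: pick the class with the highest posterior."""
    return max(posteriors, key=posteriors.get)


# Toy training set with made-up coordinates (illustration only).
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                (1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
train_labels = ["ND", "ND", "ND", "DD", "DD", "DD"]

post = knn_posteriors((0.4, 0.4), train_points, train_labels,
                      k=5, classes=["ND", "DD"])
print(post, hard_label(post))
```

In this toy case the event lies in an overlap region: the soft output keeps a sizeable DD probability, whereas the hard label discards it.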

Because of an effect known as the curse of dimensionality, the performance of the kNN
algorithm can be significantly improved by reducing the dimensionality of the data. To
this end, we use the linear discriminant analysis (LDA) algorithm, which is a
dimensionality reduction algorithm for labeled data [5]. It performs a mapping x ↦ Wx from the
original D-dimensional space into a subspace with dimensionality d = C - 1, where C
is the number of classes. The matrix W is chosen such that the distance between the
classes is maximized and the spread of each class is minimized.
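
As a hedged illustration of the LDA criterion, the sketch below treats only the simplest case of C = 2 classes in D = 2 dimensions, where W reduces to a single direction w proportional to S_W^{-1}(m_1 - m_0) (Fisher's discriminant) and d = C - 1 = 1. The multi-class case requires solving a generalized eigenvalue problem and is not shown. All names and the toy data are assumptions for illustration.

```python
def mean(points):
    """Component-wise mean of a list of 2D points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(2)]


def scatter(points, m):
    """2x2 scatter matrix sum_n (x_n - m)(x_n - m)^T of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        dx = [p[0] - m[0], p[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += dx[a] * dx[b]
    return s


def fisher_direction(class0, class1):
    """Two-class LDA direction w = S_W^{-1} (m1 - m0)."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Within-class scatter: sum of the per-class scatter matrices.
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Apply the explicit inverse of the 2x2 matrix S_W to (m1 - m0).
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]


def project(w, x):
    """Map a 2D point to the 1D discriminant subspace."""
    return w[0] * x[0] + w[1] * x[1]


# Toy classes: wide spread along the first axis, separated along the second.
class0 = [(0.0, 0.0), (1.0, 0.2), (2.0, -0.1), (3.0, 0.1)]
class1 = [(0.0, 1.0), (1.0, 1.2), (2.0, 0.9), (3.0, 1.1)]

w = fisher_direction(class0, class1)
proj0 = [project(w, p) for p in class0]
proj1 = [project(w, p) for p in class1]
print(w, proj0, proj1)
```

The direction found ignores the first coordinate, which carries large spread but no class information, so the two classes separate cleanly after projection.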

SOFT CLASSIFICATION OF DIFFRACTION 

To study the feasibility of soft classification for distinguishing between the different
diffractive classes, we generated a sample of √s = 1 TeV minimum bias events using
PYTHIA6 with the D6T tune [6]. The sample contained SDL, SDR, DD and ND events
in ratios determined by the MC tune. Starting from this generator level information,
we calculated energy deposits and charged particle multiplicities registered by the IP5
detectors at the LHC based on their geometric acceptances. By dividing the CMS central
tracker into 3 η bins and T1 and T2 on both sides into 2 bins each, multiplicity was recorded
in 11 η bins. The same number of bins was also used for energy deposits, corresponding
to division of the central calorimeters into 3 bins, HF on both sides into 2 bins and a
single bin each for CASTOR and the Zero Degree Calorimeters (ZDC) on both sides of
the interaction point. In the case of the ZDC, only the energy of neutral particles was
recorded. No thresholds or other detector effects were included in the simulations. By
also computing the scalar sum of p_T and the invariant mass of charged particles within
|η| < 2.5, each event was represented by a 24-dimensional data vector x.

^1 See [3] for a feasibility study of such a classification scheme.
^2 We also experimented with more advanced soft classification methods such as kernel density estimation
and non-linear discriminant analysis, but they gave no advantage over kNN.

FIGURE 1. Charged particle multiplicity distributions for diffractive events when event categories are
determined using different classification schemes. The distribution for the right side single diffraction is
essentially a mirror image of the left side distribution shown here. The plots allow comparison between
correctly labeled data (Pythia6), soft classification (soft kNN) and hard classification (hard kNN and
neural network). At central rapidities, hard classification underestimates all the diffractive contributions
while soft classification is able to better reproduce the correct distributions. The accuracy of all the
algorithms is impaired at large |η|, where information from only the ZDC detector is available.
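
The multiplicity part of the event representation described above can be sketched as follows. The pseudorapidity ranges used for the tracker, T1 and T2 are approximate values assumed here for illustration; the text does not state the exact acceptances or bin edges, and the energy and invariant mass components are omitted. Particles are given as hypothetical (eta, charge, pT) tuples.

```python
def split_range(lo, hi, n):
    """Split [lo, hi) into n equal-width (lo, hi) bins."""
    width = (hi - lo) / n
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n)]


def mirror(bins):
    """Reflect positive-side bins to the negative side of the detector."""
    return [(-hi, -lo) for lo, hi in reversed(bins)]


# ASSUMED approximate acceptances (illustrative, not taken from the text).
TRACKER = split_range(-2.5, 2.5, 3)  # central tracker, 3 bins
T1_POS = split_range(3.1, 4.7, 2)    # T1 on the positive side, 2 bins
T2_POS = split_range(5.3, 6.5, 2)    # T2 on the positive side, 2 bins

# 2 + 2 + 3 + 2 + 2 = 11 multiplicity bins, ordered from negative to positive eta.
MULT_BINS = mirror(T2_POS) + mirror(T1_POS) + TRACKER + T1_POS + T2_POS


def multiplicity_features(particles):
    """Count charged particles per eta bin; particles are (eta, charge, pt)."""
    counts = [0] * len(MULT_BINS)
    for eta, charge, _pt in particles:
        if charge == 0:
            continue  # neutral particles do not contribute to multiplicity
        for i, (lo, hi) in enumerate(MULT_BINS):
            if lo <= eta < hi:
                counts[i] += 1
                break
    return counts


def scalar_sum_pt(particles, eta_max=2.5):
    """Scalar sum of pT of charged particles within |eta| < eta_max."""
    return sum(pt for eta, charge, pt in particles
               if charge != 0 and abs(eta) < eta_max)


# Made-up event: four charged particles and one neutral particle.
event = [(0.1, 1, 0.5), (-2.0, -1, 1.0), (3.5, 1, 0.3),
         (-6.0, -1, 0.2), (1.0, 0, 2.0)]
counts = multiplicity_features(event)
sum_pt = scalar_sum_pt(event)
print(counts, sum_pt)
```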

The MC sample was divided into training, validation and test sets, each containing
50000 events, and the components of x were transformed with the mapping x_i ↦ log(x_i + 1). After
further normalization for mean and variance, the dimensionality of the events was
reduced to 3 using LDA. The optimal value of the parameter k for this data was found
by maximizing the efficiency on the validation set. The kNN algorithm was
then used to perform both soft and hard classification of the test set. As an additional
benchmark, we also trained an MLP neural network [5] with 10 hidden nodes on a single
hidden layer to perform hard classification of the same test set.
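
The two normalization steps can be sketched as below. One assumption is made that the text does not state: the mean and standard deviation are estimated on the training set only and then applied unchanged to the other sets. Function names and toy numbers are illustrative.

```python
import math
import statistics


def log_transform(events):
    """Apply x_i -> log(x_i + 1) to every component of every event."""
    return [[math.log1p(v) for v in ev] for ev in events]


def fit_standardizer(train):
    """Per-component mean and standard deviation of the training set."""
    columns = list(zip(*train))
    means = [statistics.fmean(c) for c in columns]
    # Guard against constant components to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds


def standardize(events, means, stds):
    """Shift and scale each component to zero mean and unit variance."""
    return [[(v - m) / s for v, m, s in zip(ev, means, stds)]
            for ev in events]


# Made-up two-component events (e.g. a multiplicity and an energy deposit).
train = [[0.0, 10.0], [3.0, 20.0], [8.0, 40.0]]
test = [[1.0, 15.0]]

train_log = log_transform(train)
means, stds = fit_standardizer(train_log)
train_std = standardize(train_log, means, stds)
test_std = standardize(log_transform(test), means, stds)
print(train_std, test_std)
```

The logarithm compresses the long tails of multiplicity and energy distributions before the Euclidean distance of kNN is applied, so that no single component dominates the distance.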

The classification results were then used to reconstruct the multiplicity distributions 
of the different event types. The obtained diffractive distributions shown in Figure 1 
indicate that soft classification is able to better reproduce the correct distributions than 
hard classification. Note also that both hard classification algorithms produce very 
similar outputs while the results of soft classification are qualitatively different from 
this. Similar results were also obtained for the distribution. We also observed that 
the relative event rates estimated using the soft kNN algorithm are very accurate (see
Table 1) and clearly better than the ones given by the hard methods. 
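
The difference between the two ways of reconstructing observables can be sketched as follows, under the assumption that a soft estimate is formed by letting each event contribute to class C_i with weight p(C_i|x), while a hard estimate counts each event once in its most probable class. The posteriors and multiplicities are made-up toy values, not results of this study.

```python
from collections import defaultdict

CLASSES = ["ND", "DD", "SDR", "SDL"]


def soft_rates(posteriors):
    """Relative rate of each class as the average posterior over events."""
    n = len(posteriors)
    return {c: sum(p[c] for p in posteriors) / n for c in CLASSES}


def hard_rates(posteriors):
    """Relative rate of each class from counting most probable labels."""
    n = len(posteriors)
    counts = dict.fromkeys(CLASSES, 0)
    for p in posteriors:
        counts[max(CLASSES, key=p.get)] += 1
    return {c: counts[c] / n for c in CLASSES}


def soft_histogram(values, posteriors, cls):
    """Histogram of an observable with each event weighted by p(cls | x)."""
    hist = defaultdict(float)
    for v, p in zip(values, posteriors):
        hist[v] += p[cls]
    return dict(hist)


# Made-up posteriors for four events and their charged multiplicities.
posteriors = [
    {"ND": 0.6, "DD": 0.4, "SDR": 0.0, "SDL": 0.0},
    {"ND": 0.6, "DD": 0.2, "SDR": 0.2, "SDL": 0.0},
    {"ND": 0.2, "DD": 0.2, "SDR": 0.0, "SDL": 0.6},
    {"ND": 0.8, "DD": 0.2, "SDR": 0.0, "SDL": 0.0},
]
multiplicities = [12, 7, 3, 20]

soft = soft_rates(posteriors)
hard = hard_rates(posteriors)
dd_hist = soft_histogram(multiplicities, posteriors, "DD")
print(soft, hard, dd_hist)
```

In this toy example no event has DD as its most probable class, so the hard estimate of the DD rate is zero, while the soft estimate retains the DD probability mass spread over all four events.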



TABLE 1. Relative event rates (in %) and their deviations from Pythia6 with the
different classification schemes. Soft kNN is able to estimate the rates with a very high
accuracy while both hard classification algorithms overestimate the non-diffractive
contribution and underestimate all the diffractive classes.

                 ND              DD              SDR            SDL
PYTHIA6          67.84           13.00           9.72           9.44
Soft kNN         67.66 (-0.18)   13.07 (+0.07)   9.78 (+0.06)   9.48 (+0.04)
Hard kNN         70.13 (+2.29)   11.67 (-1.33)   9.52 (-0.20)   8.67 (-0.77)
Neural network   69.67 (+1.83)   12.15 (-0.85)   8.97 (-0.75)   9.20 (-0.24)



CONCLUSIONS 

We propose a probabilistic multivariate approach called soft classification for
identification and classification of diffraction. The results obtained using the soft kNN algorithm
on a generator-level MC sample show that the approach accurately reproduces physical
observables. Soft classification could hence serve as an alternative to the rapidity gap
method. The main drawback of the approach is its dependence on the selection of the
training set, which makes the classification MC dependent. The severity of this
dependence is the subject of an ongoing study, the preliminary results of which suggest that soft
classification is more robust against a misspecified training set than the hard methods.
In some cases, it might also be possible to use data-driven methods for constructing
the training set. The natural next step of the study is to employ detector-level MC and
eventually perform a full physics analysis using real data.